180 Amsterdam's Report

Author

Joseph H

1 Introduction

This report is one of 493,845 that I will make, and one of 104,070,413 that could be made.
I took the 1.4 TB LinkedIn dataset that was breached in 2020 and turned it into insights to power my job hunt.
The insights I can share in this report, which also relate to my goals, are:
- Industry-based recruitment trends.
- Company-based workforce timeline.
- Current/past workforce info:
  - Basic info: name, job title, status, social links. I could add geolocation for those who have the data, but it would look creepy.
  - Their work period.
  - Their experience.

2 About me

Salutations! I’m Joseph, a self-taught data analyst, engineer, and scraper.
Despite life’s challenges, my goal remains the same: a remote job, full-time or part-time, and friends to tackle the challenges of this changing world with.
To show my skills and dedication, I built the project that produced this tailored report.

3 About the project

3.1 How this project came to life

As you know from my email, I am hunting for a job.
About a year ago, I scraped contact info from Google Maps to get my first job. Later I scraped contacts from the LinkedIn website… you can check how that went here.

Recently, I finally started learning SQL because of DuckDB, a database engine that lets you process data larger than memory on a local machine by spilling to disk, in effect using storage space as RAM. Then I remembered the leaked LinkedIn data that I had not been able to process.
So my journey began: learn SQL, process the data, and make something out of it.

3.2 The process

The whole process ran on my local machine and went as follows.

3.2.1 Downloaded the leaked data

I downloaded the data from a torrent.
There were around 700 .gz files, each around 280 MB; 196 GB in total.
Each .gz file contains a single 2 GB file; 1.4 TB in total.
Each file has many lines, and each line is a separate JSON object. The file as a whole is not one JSON document; it simply holds multiple JSON objects, one per line.
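A file laid out this way can be read one record at a time without unpacking it to disk first. The snippet below is a minimal sketch using only the Python standard library; the function name `iter_records` is hypothetical and is not taken from the original script.

```python
import gzip
import json
from pathlib import Path
from typing import Iterator


def iter_records(archive: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a gzipped file."""
    # Mode "rt" decompresses on the fly and decodes to text, so the full
    # decompressed payload is never held in memory or written to disk.
    with gzip.open(archive, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

Because the function is a generator, a caller can stop after the first few records to inspect the structure before committing to a full pass over a file.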

3.2.2 Processing the weird data

In this phase I wrote a script that automatically opens an archive, processes the file inside, and saves it as a Parquet file with a compression level of 22.
I used Python, Pathlib, Polars, and a lot of patience.
Processing took around 20 minutes per file and around three weeks in total, since I had to shut down my PC at night. The result was 700 Parquet files, each around 190 MB; 133 GB in total.

3.2.3 Making relational database

The data in the datasets was nested, especially the “experience” field, which held both a person’s experience and the company info. The problem is that the company info gets repeated multiple times across all datasets.
Making a relational database solves this and makes the exploratory data analysis easier.
The code was split in two:
1. I used Polars to split each of the 700 datasets into mini relational databases.
2. I used DuckDB to merge all the mini relational databases and remove duplicates in some tables, mainly the company and university information.

The result was a relational database that is 73 GB in size, down from 1.4 TB of raw data.

All of this ran on my PC, so no servers were harmed, only my CPU fan and my ears.

3.2.4 Filter

I filtered the companies based on their industry, their country, and whether I have the email of one of the higher-ups.

4 General graphs

4.1 Marketing and advertising industry’s yearly new recruit count

4.2 180 Amsterdam’s workforce status over the years

5 Workforce sample

5.1 Abraham Turner

Job title: Global account manager
Associated: True
Socials: https://linkedin.com/in/abraham-turner-8a669649 | https://facebook.com/100004460479272

5.1.1 Abraham Turner’s working period at 180 Amsterdam

5.1.2 Gantt plot of Abraham Turner’s experience


5.2 Akvilina Jaskunaite

Job title: Business affairs manager
Associated: True
Socials: https://linkedin.com/in/akvilina-jaskunaite-6b541946

5.2.1 Akvilina Jaskunaite’s working period at 180 Amsterdam

5.2.2 Gantt plot of Akvilina Jaskunaite’s experience


5.3 Andre Teixeira

Job title: Diretor
Associated: False
Socials: https://linkedin.com/in/andre-teixeira-30b6554b

5.3.1 Andre Teixeira’s working period at 180 Amsterdam

5.3.2 Gantt plot of Andre Teixeira’s experience


5.4 Benjamin Bregeault

Job title: Senior creative
Associated: False
Socials: https://linkedin.com/in/benjaminbregeault | https://linkedin.com/in/benjamin-bregeault-632b4b2a

5.4.1 Benjamin Bregeault’s working period at 180 Amsterdam

5.4.2 Gantt plot of Benjamin Bregeault’s experience


5.5 Dee Ramadan

Job title: Freelance photographer
Associated: False
Socials: https://linkedin.com/in/dee-ramadan-539a2795

5.5.1 Dee Ramadan’s working period at 180 Amsterdam

5.5.2 Gantt plot of Dee Ramadan’s experience


5.6 Houssin Ghanmi

Job title: Laam
Associated: True
Socials: https://linkedin.com/in/houssin-ghanmi-8744b188

5.6.1 Houssin Ghanmi’s working period at 180 Amsterdam

5.6.2 Gantt plot of Houssin Ghanmi’s experience


5.7 Ingrid Tappin

Job title: Digital media producer
Associated: False
Socials: https://facebook.com/ingridtappin | https://linkedin.com/in/ingrid-tappin-188b78 | https://angel.co/drs-ingrid-tappin | https://twitter.com/ingridtappin | https://linkedin.com/in/ingridtappin

5.7.1 Ingrid Tappin’s working period at 180 Amsterdam

5.7.2 Gantt plot of Ingrid Tappin’s experience


5.8 Israel Santiago

Job title: Dancarino
Associated: True
Socials: https://linkedin.com/in/israel-santiago-a0715769

5.8.1 Israel Santiago’s working period at 180 Amsterdam

5.8.2 Gantt plot of Israel Santiago’s experience


5.9 Joe Craig

Job title: Creative director
Associated: True
Socials: https://linkedin.com/in/joe-craig-688b6b1a

5.9.1 Joe Craig’s working period at 180 Amsterdam

5.9.2 Gantt plot of Joe Craig’s experience


5.10 Kerry Murphy

Job title: Freelance motion graphics designer
Associated: False
Socials: https://linkedin.com/in/kerrymurphy | https://twitter.com/k3rrymurphy

5.10.1 Kerry Murphy’s working period at 180 Amsterdam

5.10.2 Gantt plot of Kerry Murphy’s experience


5.11 Leonie Wardenaar

Job title: Freelance redacteur
Associated: True
Socials: https://linkedin.com/in/leonie-wardenaar-493baa23

5.11.1 Leonie Wardenaar’s working period at 180 Amsterdam

5.11.2 Gantt plot of Leonie Wardenaar’s experience


5.12 Lucas Andrade

Job title: 1.000
Associated: True
Socials: https://linkedin.com/in/lucas-andrade-71b9aa56

5.12.1 Lucas Andrade’s working period at 180 Amsterdam

5.12.2 Gantt plot of Lucas Andrade’s experience


5.13 Mark Kenny

Job title: It manager
Associated: True
Socials: https://flickr.com/people/markkenny | https://facebook.com/markkennynl | https://vimeo.com/markkenny | https://twitter.com/markkenny | https://linkedin.com/in/markkenny

5.13.1 Mark Kenny’s working period at 180 Amsterdam

5.13.2 Gantt plot of Mark Kenny’s experience


5.14 Mohamad Banihani

Job title: Stydant
Associated: True
Socials: https://linkedin.com/in/mohamad-banihani-3bbba06b

5.14.1 Mohamad Banihani’s working period at 180 Amsterdam

5.14.2 Gantt plot of Mohamad Banihani’s experience


5.15 Natalie Macarthur

Job title: Senior advertising manager
Associated: False
Socials: https://facebook.com/natalieamacarthur | https://twitter.com/nmacarthur | https://linkedin.com/in/natalie-macarthur-3765235

5.15.1 Natalie Macarthur’s working period at 180 Amsterdam

5.15.2 Gantt plot of Natalie Macarthur’s experience


5.16 Rachel Perry

Job title: Producer and project manager
Associated: False
Socials: https://linkedin.com/in/raperry | https://linkedin.com/in/rachel-perry-36706510

5.16.1 Rachel Perry’s working period at 180 Amsterdam

5.16.2 Gantt plot of Rachel Perry’s experience


5.17 Ricardo Adolfo

Job title: Copywriter
Associated: False
Socials: https://linkedin.com/in/ricardoadolfo

5.17.1 Ricardo Adolfo’s working period at 180 Amsterdam

5.17.2 Gantt plot of Ricardo Adolfo’s experience


5.18 Richard Cashdan

Job title: Senior freelance copywriter
Associated: False
Socials: https://linkedin.com/in/richard-cashdan-1a4a2315

5.18.1 Richard Cashdan’s working period at 180 Amsterdam

5.18.2 Gantt plot of Richard Cashdan’s experience


5.19 Roald Van Oosten

Job title: Workspace
Associated: True
Socials: https://facebook.com/roald.vanoosten | https://linkedin.com/in/rvanoosten

5.19.1 Roald Van Oosten’s working period at 180 Amsterdam

5.19.2 Gantt plot of Roald Van Oosten’s experience


5.20 Sarr Gor

Job title: Mae gor
Associated: True
Socials: https://linkedin.com/in/sarr-mame-gor-15b8a89b

5.20.1 Sarr Gor’s working period at 180 Amsterdam

5.20.2 Gantt plot of Sarr Gor’s experience


5.21 Stephane Lecoq

Job title: Creative director
Associated: True
Socials: https://facebook.com/stephane.lecoq.946 | https://linkedin.com/in/stephane-lecoq-5b2b3710

5.21.1 Stephane Lecoq’s working period at 180 Amsterdam

5.21.2 Gantt plot of Stephane Lecoq’s experience


5.22 Taelor Ekaterini

Job title: Sale at 180 amsterdam
Associated: True
Socials: https://linkedin.com/in/taelor-ekaterini-3318a260

5.22.1 Taelor Ekaterini’s working period at 180 Amsterdam

5.22.2 Gantt plot of Taelor Ekaterini’s experience


5.23 Terri Dipaolo

Job title: Business affairs manager
Associated: False
Socials: https://linkedin.com/in/terri-dipaolo-9399444

5.23.1 Terri Dipaolo’s working period at 180 Amsterdam

5.23.2 Gantt plot of Terri Dipaolo’s experience


5.24 Terry Doyle

Job title: Janitor
Associated: True
Socials: https://linkedin.com/in/terry-doyle-20277528

5.24.1 Terry Doyle’s working period at 180 Amsterdam

5.24.2 Gantt plot of Terry Doyle’s experience


5.25 Toon Leysen

Job title: Art director
Associated: True
Socials: https://twitter.com/toonleysen | https://linkedin.com/in/toonleysen

5.25.1 Toon Leysen’s working period at 180 Amsterdam

5.25.2 Gantt plot of Toon Leysen’s experience


5.26 Youssef Amekrane

Job title: Amekraney@gmail.com
Associated: True
Socials: https://linkedin.com/in/youssef-amekrane-9232594b

5.26.1 Youssef Amekrane’s working period at 180 Amsterdam

5.26.2 Gantt plot of Youssef Amekrane’s experience